Speeding up the Convergence of Real-Time Search: Empirical Setup and Proofs
Abstract
This technical report contains the formal proofs of all our theoretical results, together with a description of the experimental setup behind the results reported in our AAAI-2000 paper, "Speeding up the Convergence of Real-Time Search." In that paper, we propose to speed up the convergence of real-time search methods such as LRTA*. We show that LRTA* often converges significantly faster when it breaks ties towards successors with smallest f-values (à la A*) and even faster when it moves to successors with smallest f-values instead of only breaking ties in favor of them. FALCONS, our novel real-time search method, uses a sophisticated implementation of this successor-selection rule and thus selects successors very differently from LRTA*, which always minimizes the estimated cost to go. Our approach opens up new avenues of research for the design of novel successor-selection rules that speed up the convergence of both real-time search methods and reinforcement-learning methods. Indeed, our AAAI-2000 paper presents experiments in which FALCONS finds a shortest path up to sixty percent faster than LRTA* in terms of action executions and up to seventy percent faster in terms of trials. In this report, we first describe our experimental setup and then prove that FALCONS terminates and converges to a shortest path.
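The three successor-selection rules contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the FALCONS implementation, which the abstract describes as more sophisticated. The sketch assumes that f(s) = g(s) + h(s) as in A*, where g is an estimate of the cost from the start and h an estimate of the cost to go; the function names, the dictionaries, and the example values are all hypothetical.

```python
# Hypothetical example data: three successors of the current state.
succs = ["a", "b", "c"]
cost = {"a": 1, "b": 1, "c": 1}   # action cost from the current state
h = {"a": 3, "b": 3, "c": 4}      # estimated cost to go (assumed values)
g = {"a": 5, "b": 2, "c": 0}      # estimated cost from the start (assumed values)


def select_lrta(succs, cost, h):
    """LRTA*-style rule: minimize the estimated cost to go, cost + h."""
    return min(succs, key=lambda s: cost[s] + h[s])


def select_tie_break_f(succs, cost, g, h):
    """Minimize cost + h, breaking ties towards the smallest f = g + h."""
    return min(succs, key=lambda s: (cost[s] + h[s], g[s] + h[s]))


def select_min_f(succs, cost, g, h):
    """Move to the successor with the smallest f = g + h; ties go to cost + h."""
    return min(succs, key=lambda s: (g[s] + h[s], cost[s] + h[s]))


if __name__ == "__main__":
    print(select_lrta(succs, cost, h))
    print(select_tie_break_f(succs, cost, g, h))
    print(select_min_f(succs, cost, g, h))
```

On this example data the three rules pick three different successors, which shows how ordering the same two criteria differently changes the agent's move.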
Similar resources
Speeding up the Convergence of Real-Time Search
Learning Real-Time A* (LRTA*) is a real-time search method that makes decisions fast and still converges to a shortest path when it solves the same planning task repeatedly. In this paper, we propose new methods to speed up its convergence. We show that LRTA* often converges significantly faster when it breaks ties towards successors with smallest f-values (à la A*) and even faster when it move...
Value Back-Propagation versus Backtracking in Real-Time Heuristic Search
One of the main drawbacks of the LRTA* real-time heuristic search algorithm is slow convergence. Backtracking as introduced by SLA* is one way of speeding up the convergence, although at the cost of sacrificing first-trial performance. The backtracking mechanism of SLA* consists of back-propagating updated heuristic values to previously visited states while the algorithm retracts its steps. In ...
Prioritized-LRTA*: Speeding Up Learning via Prioritized Updates
Modern computer games demand real-time simultaneous control of multiple agents. Learning real-time search, which interleaves planning and acting, allows agents to both learn from experience and respond quickly. Such algorithms require no prior knowledge of the environment and can be deployed without pre-processing. We introduce Prioritized-LRTA*, an algorithm based on Prioritized Sweeping. This ...
A fuzzy mixed-integer goal programming model for a parallel machine scheduling problem with sequence-dependent setup times and release dates
This paper presents a new mixed-integer goal programming (MIGP) model for a parallel machine scheduling problem with sequence-dependent setup times and release dates. Two objectives are considered in the model to minimize the total weighted flow time and the total weighted tardiness simultaneously. Due to the complexity of the above model and uncertainty involved in real-world scheduling probl...
Integrated JIT Lot-Splitting Model with Setup Time Reduction for Different Delivery Policy using PSO Algorithm
This article develops an integrated JIT lot-splitting model for a single supplier and a single buyer. The model considers setup time reduction, and the optimal lot size is obtained under the reduced setup time through joint optimization for both buyer and supplier, under deterministic conditions with a single product. Two cases are discussed: Single Delivery (SD) case, and Multi...